ESTIMATION IN PARAMETRIC MIXTURE FAMILIES

by 

Alan  E.  Gelfand 

1.  Introduction 

For mixture distributions of the form

$$f_\theta(z) = \int f(z \mid \eta)\, dF_\theta(\eta), \tag{1}$$

we consider estimation of $g(\theta)$ under squared error loss.
Conditions for the identifiability of $f_\theta(z)$ in $F_\theta(\eta)$ with
respect to $f(z \mid \eta)$ are discussed in Teicher (1961).

We look at three different problems:

(i) In Section 2 we investigate the possibility of uniformly improving upon an unbiased estimator of $g(\theta)$;

(ii) In Section 3 we offer characterizations of Bayes rules for $g(\theta)$ as well as a simple complete class theorem;

(iii) In Section 4 we investigate the performance of empirical Bayes rules generated through the EM algorithm.

Results will generally be given for $z$, $\eta$, $\theta$ univariate, although multidimensional extensions are available in some cases.

Our primary illustrative examples will be in the context of the noncentral chi-square distribution, where $f(z \mid \eta)$ is $\chi^2_{p+2\eta}$ and $F_\theta(\eta)$ is Poisson with intensity parameter $\theta$. Motivation is provided by recognizing that $Z$ is inadmissible for estimating $g(\theta) = E_\theta(Z)$ and that $Z^{-1}$ is inadmissible for estimating $g(\theta) = E_\theta(Z^{-1})$. For the former, $\delta(Z) = \max(Z, p)$ clearly dominates $Z$. For the latter, $\delta(Z) = \min(Z^{-1}, (p-2)^{-1})$ clearly dominates $Z^{-1}$.
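The first domination claim can be checked pointwise. The sketch below is only an illustration: it assumes the target is $g(\theta) = p + 2\theta$ (the mean of $Z$ under the Poisson mixture of $\chi^2_{p+2\eta}$ laws), and the function name and grids are hypothetical choices made here, not part of the paper.

```python
def dominates_pointwise(z, p, theta):
    """Check that max(z, p) is at least as close to p + 2*theta as z is."""
    target = p + 2.0 * theta              # assumed target g(theta) = E_theta(Z)
    err_raw = (z - target) ** 2           # squared error of Z
    err_trunc = (max(z, p) - target) ** 2 # squared error of max(Z, p)
    return err_trunc <= err_raw

p = 6
grid_z = [0.25 * i for i in range(1, 201)]     # z values in (0, 50]
grid_theta = [0.5 * j for j in range(0, 41)]   # theta values in [0, 20]

# The truncated estimator is never worse on the grid ...
all_ok = all(dominates_pointwise(z, p, t) for z in grid_z for t in grid_theta)

# ... and is strictly better whenever z < p.
strict_somewhere = any(
    (max(z, p) - (p + 2 * t)) ** 2 < (z - (p + 2 * t)) ** 2
    for z in grid_z for t in grid_theta
)
```

Because the squared error is no larger for every $z$ and strictly smaller when $z < p$, the risk inequality follows by taking expectations.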

2.  Improving  Upon  Unbiased  Estimators 

Let $T(Z)$ be an unbiased estimator of $g(\theta)$ and let
$a(\eta) = E(T(Z) \mid \eta)$. The "associated conditional problem" is to
estimate $a(\eta)$ under squared error loss within the family
$f(z \mid \eta)$. We have the following result.

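Theorem 1: $T(Z) + c\,\phi(Z)$, $c > 0$, dominates $T(Z)$ under
squared error loss if

$$\mathrm{cov}_\theta\big(a(\eta),\, E(\phi \mid \eta)\big) \le 0 \quad \forall \theta \tag{2}$$

and

$$\sup_\eta \frac{E[(T - a)\phi \mid \eta]}{E(\phi^2 \mid \eta)} \le -\frac{c}{2}. \tag{3}$$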

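Proof. By direct calculation we may show that the difference
in risk between $T$ and $T + c\phi$ is

$$-E_\theta\big(I_\phi(\eta)\big) - 2c\, \mathrm{cov}_\theta\big(a(\eta), \phi(Z)\big)$$

where

$$I_\phi(\eta) = c^2 E(\phi^2 \mid \eta) + 2c\, E[(T - a)\phi \mid \eta].$$

But (2) is equivalent to $\mathrm{cov}_\theta(a(\eta), \phi(Z)) \le 0$ while (3) implies
$I_\phi(\eta) \le 0$ $\forall \eta$, whence $T + c\phi$ dominates $T$. □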
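The "direct calculation" in the proof of Theorem 1 can be spelled out as follows. This is a sketch of the intermediate steps, using only the unbiasedness identity $E_\theta\, a(\eta) = g(\theta)$; the risk notation $R(\theta, \cdot)$ is introduced here for convenience.

```latex
\begin{align*}
R(\theta, T + c\phi) - R(\theta, T)
  &= c^2 E_\theta \phi^2 + 2c\, E_\theta\big[(T - g(\theta))\,\phi\big] \\
  &= E_\theta\big\{ c^2 E(\phi^2 \mid \eta) + 2c\, E[(T - a)\phi \mid \eta] \big\}
     + 2c\, E_\theta\big[(a(\eta) - g(\theta))\, E(\phi \mid \eta)\big] \\
  &= E_\theta I_\phi(\eta) + 2c\, \mathrm{cov}_\theta\big(a(\eta),\, E(\phi \mid \eta)\big).
\end{align*}
```

The last line uses $E_\theta\, a(\eta) = g(\theta)$, so the cross term is a covariance, and $\mathrm{cov}_\theta(a(\eta), \phi(Z)) = \mathrm{cov}_\theta(a(\eta), E(\phi \mid \eta))$ by conditioning on $\eta$.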

Remark 1: To hope to satisfy condition (3), we require
$\mathrm{cov}(T, \phi \mid \eta) < 0$, whence it is convenient to choose $\phi$ inversely
related to $T$. In many well-known examples, such a choice of $\phi$
leads to a dominating estimator, e.g., if $g(\theta) \ge C$, $T^+$ dominates
$T$ and $\phi = T^+ - T$ is decreasing in $T$.


Remark 2: $\phi$ decreasing in $T$ does not necessarily imply
condition (2) is met. However, if $T \mid \eta$ is a natural exponential
family (Morris, 1982), i.e., $f(t \mid \eta) = e^{t\eta - \psi(\eta)}$, then $a(\eta) = E(T \mid \eta) = \psi'(\eta)$
increases in $\eta$, while for $E(\phi \mid \eta)$, $\frac{d}{d\eta} E(\phi \mid \eta) = \mathrm{cov}(T, \phi \mid \eta) < 0$
implies $E(\phi \mid \eta)$ decreases in $\eta$, so that (2) holds.

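Remark 3: If $T(Z)$ is admissible for $a(\eta)$ in the conditional
problem, it may or may not be admissible for $g(\theta)$. For example,
if $Z \mid \eta \sim N(\eta, 1)$ and $\eta \mid \theta \sim N(\theta, 1)$, then $Z$ is admissible for both
$a(\eta) = \eta$ and $g(\theta) = \theta$. If $Z \mid \eta \sim \eta^{-1} e^{-z/\eta}$ and $\eta \mid \theta \sim \theta^{-1} e^{-\eta/\theta}$, $Z/2$
is admissible for $a(\eta) = \eta$ but $cZ$, $0 < c < 1/2$, dominates $Z/2$ for
$g(\theta) = \theta$.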
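The exponential example in Remark 3 can be verified from closed-form risks. The sketch below assumes both laws are exponential with means $\eta$ and $\theta$ respectively, so that $E(Z^2 \mid \eta) = 2\eta^2$ and $E_\theta(Z^2) = 4\theta^2$; the risk formulas are derived here under that assumption and the function names are hypothetical.

```python
def conditional_risk(c, eta):
    """E[(cZ - eta)^2 | eta] when Z | eta is exponential with mean eta."""
    return eta ** 2 * (2 * c ** 2 - 2 * c + 1)

def unconditional_risk(c, theta):
    """E_theta[(cZ - theta)^2] when eta | theta is exponential with mean theta."""
    return theta ** 2 * (4 * c ** 2 - 2 * c + 1)

multipliers = [0.01 * k for k in range(1, 100)]   # c in (0, 1)

# Conditionally, c = 1/2 is the best multiple of Z for estimating eta.
best_conditional = min(multipliers, key=lambda c: conditional_risk(c, 1.0))

# Unconditionally, every c in (0, 1/2) beats c = 1/2 for estimating theta.
beats_half = all(
    unconditional_risk(c, 1.0) < unconditional_risk(0.5, 1.0)
    for c in multipliers if c < 0.5 - 1e-12
)
```

Both risks scale with $\eta^2$ or $\theta^2$, so checking a single parameter value is enough for the comparison across multipliers.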

Remark 4: If $T$, unbiased for $a(\eta)$, is dominated by $S$ in
the conditional problem, it is possible that $T$ dominates $S$ in
the unconditional problem. In particular, if we take $\phi = S - T$,
$c = 1$, we must have (3) hold, i.e., $\mathrm{cov}(T, \phi \mid \eta) < 0$ $\forall \eta$, while
the left-hand side of (2) is sufficiently positive $\forall \theta$. Examples
can readily be constructed using 3 point distributions for $Z \mid \eta$.

Remark 5: In the preceding remark, $S$ will dominate $T$ in
the unconditional problem if $T \mid \eta$ is a natural exponential family,
by Remark 2.

Suppose instead $F_\theta(\eta)$ is a natural exponential family in $\eta$
dominated by $\mu$ with density $f(\eta \mid \theta) = e^{\eta\theta - \chi(\theta)}$ and without loss
of generality suppose $a(\eta) = \eta$. In this setting, Karlin (1958)
supplies conditions such that $c\eta$ is admissible for $\chi'(\theta)$ under
squared error loss. These conditions require $c > 0$ and


where $(\underline{\theta}, \bar{\theta})$ is the natural parameter space for $f(\eta \mid \theta)$ and $\theta_0$ is
an arbitrary interior point.

Suppose $c\eta$ is admissible for $\chi'(\theta)$. Is $cT$ admissible for
$\chi'(\theta)$? Remark 3 shows that this is not necessarily the case.

We can show the following if Karlin's conditions hold for $c$.

Theorem 2: If

$$E_\theta\, \mathrm{var}(S \mid \eta) \ge c^2 E_\theta\, \mathrm{var}(T \mid \eta) \quad \forall \theta \tag{5}$$

then $S$ cannot dominate $cT$ in estimating $\chi'(\theta)$.

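Proof. The proof essentially imitates Karlin's argument.

Suppose $S$ dominates $cT$. Let $b_S(\theta) = E_\theta(S) - cE_\theta(T) = E_\theta(S) - c\chi'(\theta)$, whence $b_S'(\theta) = \mathrm{cov}_\theta(\eta, E(S \mid \eta)) - c\chi''(\theta)$. Therefore

$$\big(b_S'(\theta) + c\chi''(\theta)\big)^2 \le \chi''(\theta)\, \mathrm{var}_\theta E(S \mid \eta)$$

or

$$\frac{\big(b_S'(\theta) + c\chi''(\theta)\big)^2}{\chi''(\theta)} + E_\theta\, \mathrm{var}(S \mid \eta) \le \mathrm{var}_\theta(S)$$

and finally

$$E_\theta\big(S - \chi'(\theta)\big)^2 \ge \frac{\big(b_S'(\theta) + c\chi''(\theta)\big)^2}{\chi''(\theta)} + E_\theta\, \mathrm{var}(S \mid \eta) + \big(b_S(\theta) + (c-1)\chi'(\theta)\big)^2.$$

By our supposition the left-hand side of this inequality is at most

$$E_\theta\big(cT - \chi'(\theta)\big)^2 = c^2\, \mathrm{var}_\theta(T) + [(c-1)\chi'(\theta)]^2 = c^2 E_\theta\, \mathrm{var}(T \mid \eta) + c^2 \chi''(\theta) + [(c-1)\chi'(\theta)]^2.$$

Using (5) we obtain

$$\frac{\big(b_S'(\theta) + c\chi''(\theta)\big)^2}{\chi''(\theta)} + \big(b_S(\theta) + (c-1)\chi'(\theta)\big)^2 \le c^2 \chi''(\theta) + [(c-1)\chi'(\theta)]^2 \quad \forall \theta. \tag{6}$$

Expression (6) is equivalent to Karlin, p. 413, expression (7).
The conditions (4) then imply $b_S(\theta) \equiv 0$ and $S = cT$. □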

Remark 6: This result shows that $c'T + b$ can't dominate
$cT$ if $c' \ge c$.

Remark 7: If $T$ is MVUE for $\eta$, it is MVUE for $\chi'(\theta)$.

Noncentral distributions offer a convenient family of
mixtures to study in terms of applying Theorem 1. Gelfand (1983)
provides many examples. For the noncentral chi-squared
distribution, it is shown that $\phi$ of the form $Z^{-\beta}$ or of the form
$e^{-\beta Z}$ can be used to dominate $Z$ in estimating $E_\theta(Z)$ and $Z^{-1}$ in
estimating $E_\theta(Z^{-1})$. Thus $(Z-p)/2$, the MVUE of $\theta$, can be dominated
by estimators of the form $(Z-p)/2 + cZ^{-\beta}$ and of the form
$(Z-p)/2 + ce^{-\beta Z}$ for appropriate $c$ and $\beta$. This generalizes
earlier results of Perlman and Rasmussen (1975) and of Neff and
Strawderman (1976). Of course $(Z-p)^+/2$ dominates the MVUE as
well and has been supported by Chow and Hwang (19^3), who argue
that it is a simple estimator which "cannot be improved upon
uniformly and significantly," and by Saxena and Alam (1982), who
show that it dominates the MLE for $\theta$. We will return to this
estimator in Section 4.

This section provides a method for uniformly improving
upon unbiased estimators in a mixture distribution framework.
Dominating biased estimators is a more difficult problem.

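3. Bayes Estimation

Suppose we let $\tau_\gamma(\theta)$ be a family of prior distributions
for $\theta \in \Theta$. Under squared error loss the generalized Bayes
estimator for $g(\theta)$ is

$$\delta_\gamma(z) = \big(f_\gamma(z)\big)^{-1} \int g(\theta) f_\theta(z)\, d\tau_\gamma(\theta) \tag{7}$$

where $f_\gamma(z) = \int f_\theta(z)\, d\tau_\gamma(\theta)$ is the marginal distribution of $Z$.
If the support of $\tau_\gamma$ is $\Theta$ and if $E_\gamma\, g^2(\theta) < \infty$, then the
Bayes risk of $\delta_\gamma$ is finite and $\delta_\gamma$ is admissible.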

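Let $F_\theta(\eta)$ in (1) be dominated by $\mu$ with RN derivative
$f(\eta \mid \theta)$. Then

$$\delta_\gamma(z) = \frac{\iint g(\theta) f(z \mid \eta) f(\eta \mid \theta)\, d\mu(\eta)\, d\tau_\gamma(\theta)}{\iint f(z \mid \eta) f(\eta \mid \theta)\, d\mu(\eta)\, d\tau_\gamma(\theta)} = \frac{\int b_\gamma(\eta) f(z \mid \eta)\, d\pi_\gamma(\eta)}{\int f(z \mid \eta)\, d\pi_\gamma(\eta)} \tag{8}$$

where $\pi_\gamma$ is the prior distribution induced on $\eta$ by $\tau_\gamma$, i.e.,
$d\pi_\gamma(\eta) = h_\gamma(\eta)\, d\mu(\eta)$ and $h_\gamma(\eta) = \int f(\eta \mid \theta)\, d\tau_\gamma(\theta)$, with

$$b_\gamma(\eta) = h_\gamma^{-1}(\eta) \int g(\theta) f(\eta \mid \theta)\, d\tau_\gamma(\theta), \tag{9}$$

i.e., $b_\gamma(\eta) = E_\gamma(g(\theta) \mid \eta)$.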
Expression (8) shows that

$$\delta_\gamma(z) = E(g(\theta) \mid z) = E(b_\gamma(\eta) \mid z), \tag{10}$$

i.e., we can calculate the generalized Bayes rule through the
conditional problem. If we let $c_\gamma(\eta) = E_\gamma(g^2(\theta) \mid \eta)$, then
$\delta_\gamma(z)$ has finite Bayes risk if $E\, c_\gamma(\eta) < \infty$.

Let $f(\eta \mid \theta)$ be a natural exponential family as in the
previous section. If $g(\theta) = \theta^r$, then $b_\gamma(\eta) = h_\gamma^{-1}(\eta)\, \dfrac{d^r h_\gamma(\eta)}{d\eta^r}$;
if $g(\theta) = e^{r\theta}$, $b_\gamma(\eta) = h_\gamma^{-1}(\eta)\, h_\gamma(\eta + r)$.
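The two formulas for $b_\gamma(\eta)$ follow directly from the exponential form of the mixing density. The short derivation below is a sketch that assumes $f(\eta \mid \theta) = e^{\eta\theta - \chi(\theta)}$ and that differentiation under the integral sign is permitted.

```latex
\begin{align*}
h_\gamma(\eta) &= \int e^{\eta\theta - \chi(\theta)}\, d\tau_\gamma(\theta), \\
\frac{d^r h_\gamma(\eta)}{d\eta^r} &= \int \theta^r e^{\eta\theta - \chi(\theta)}\, d\tau_\gamma(\theta)
  = h_\gamma(\eta)\, E_\gamma(\theta^r \mid \eta), \\
h_\gamma(\eta + r) &= \int e^{r\theta}\, e^{\eta\theta - \chi(\theta)}\, d\tau_\gamma(\theta)
  = h_\gamma(\eta)\, E_\gamma(e^{r\theta} \mid \eta).
\end{align*}
```

Dividing each line by $h_\gamma(\eta)$ and comparing with (9) gives the stated expressions.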

Gelfand (1983) again offers examples using noncentral
distributions. For $Z$ distributed noncentral chi-squared as in
the introduction,

$$h_\gamma(\eta) = \int \frac{e^{-\theta}\theta^{\eta}}{\eta!}\, d\tau_\gamma(\theta)$$

and

$$\delta_\gamma(Z) = \frac{\sum_\eta b_\gamma(\eta)\,(z/2)^{\eta}\, h_\gamma(\eta)\,[\Gamma(p/2 + \eta)]^{-1}}{\sum_\eta (z/2)^{\eta}\, h_\gamma(\eta)\,[\Gamma(p/2 + \eta)]^{-1}}. \tag{11}$$

If $g(\theta) = \theta^r$, $b_\gamma(\eta) = (\eta + r)_r\, h_\gamma^{-1}(\eta)\, h_\gamma(\eta + r)$. ($(x)_y$ denotes
the falling factorial of $y$ terms starting at $x$.) At, for example,
$r = 1$, (11) becomes

$$\delta_\gamma(Z) = 2Z\, \frac{d^2 J_\gamma(Z)}{dZ^2}\, J_\gamma^{-1}(Z) + p\, \frac{dJ_\gamma(Z)}{dZ}\, J_\gamma^{-1}(Z) \tag{12}$$

where $J_\gamma(Z)$ is the denominator of (11). Expression (12)
characterizes all generalized Bayes estimates of $\theta$. Gelfand
notes that setting $\delta_\gamma(Z) = (Z-p)/2$, the MVUE of $\theta$, in (12),
yields a second order homogeneous linear differential equation
which is not solvable, i.e., the inadmissible MVUE can't be
generalized Bayes.

A convenient family for $\tau_\gamma(\theta)$ are the distributions
Gamma$(p/2 + \nu, \gamma)$, $\nu > -(p-2)/2$ (i.e., $E(\theta) = \gamma^{-1}(p/2 + \nu)$),
which, for $g(\theta) = \theta$, leads to

$$\delta_{\gamma,\nu}(Z) = (\gamma + 1)^{-1}\left(p/2 + \nu + Z\, \frac{dJ_\gamma(Z)}{dZ}\, J_\gamma^{-1}(Z)\right). \tag{13}$$

At $\nu = 0$ we obtain the closed form $\delta_{\gamma,0}(Z) = (\gamma+1)^{-1}\big((\gamma+1)^{-1}Z + p\big)/2$,
including the generalized Bayes solution, $\delta_{0,0}(Z) = (Z+p)/2$.
These estimators are discussed in Perlman and Rasmussen (1975)
and in Saxena and Alam (1982), who note that $\delta_{0,0}(Z)$ is dominated
by the MVUE. It is straightforward to show that $\delta_{\gamma,\nu}(Z)$
increases in $\nu$.
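Expression (13) is easy to evaluate numerically, which gives a quick check of the $\nu = 0$ closed form and of the monotonicity in $\nu$. The sketch below rests on two observations derived here rather than stated in the paper: under the Gamma prior the induced weights on $\eta$ are proportional to $\Gamma(p/2+\eta+\nu)\,[\Gamma(p/2+\eta)\,\eta!]^{-1} x^{\eta}$ with $x = z/(2(\gamma+1))$, and $Z J'(Z)/J(Z)$ equals the posterior mean of $\eta$. The function names and the truncation at 400 terms are illustrative choices.

```python
import math

def posterior_mean_eta(x, p, nu, terms=400):
    """E(eta | z) under weights xi(eta) * x**eta, where
    xi(eta) = Gamma(p/2 + eta + nu) / (Gamma(p/2 + eta) * eta!)."""
    if x <= 0.0:
        return 0.0
    logs = [
        math.lgamma(p / 2 + n + nu) - math.lgamma(p / 2 + n)
        - math.lgamma(n + 1) + n * math.log(x)
        for n in range(terms)
    ]
    top = max(logs)                       # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)

def bayes_estimate(z, p, nu, gamma):
    """delta_{gamma,nu}(z) as in (13), with Z J'(Z)/J(Z) replaced by E(eta | z)."""
    x = z / (2.0 * (1.0 + gamma))
    return (p / 2 + nu + posterior_mean_eta(x, p, nu)) / (1.0 + gamma)

def closed_form_nu0(z, p, gamma):
    """The nu = 0 closed form (gamma+1)^{-1} ((gamma+1)^{-1} z + p) / 2."""
    return (z / (gamma + 1.0) + p) / (2.0 * (gamma + 1.0))

estimates_by_nu = [bayes_estimate(10.0, 6, nu, 1.0) for nu in (0, 1, 2)]
```

With $z = 10$, $p = 6$, $\gamma = 1$, the three entries of `estimates_by_nu` should come out in increasing order, in line with the monotonicity claim.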

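Returning to the rules in (7), we ask if they form a complete
class. The conditional problem is not useful here because in the
representation of $\delta_\gamma$ in (8), $b_\gamma$ depends upon the particular
prior, $\tau_\gamma$. We attack the problem directly utilizing results of
Sacks (1963). His Remark 3, p. 766, argues that with $g(\theta) = \theta$
the class (7) will be complete under squared error loss if $f_\theta(z)$
is continuous in both $\theta$ and $z$ and, assuming $-\infty < \theta < \infty$, for
each $\varepsilon > 0$

$$\sup_{\theta < 0} \frac{\theta^2 f_\theta(z+\varepsilon)}{f_\theta(z)} < \infty, \qquad \sup_{\theta > 0} \frac{\theta^2 f_\theta(z-\varepsilon)}{f_\theta(z)} < \infty,$$

$$\lim_{d \to \infty}\, \sup_{\theta > d} \frac{\theta^2 f_\theta(z-\varepsilon)}{f_\theta(z)} = \lim_{d \to \infty}\, \sup_{\theta < -d} \frac{\theta^2 f_\theta(z+\varepsilon)}{f_\theta(z)} = 0. \tag{14}$$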

Condition (14) will be satisfied if $f_\theta(z+\varepsilon) = O(\theta^{-2})$ $\forall z$
and $\varepsilon$, $-\infty < \varepsilon < \infty$. As noted by Sacks, parts of (14) are not
needed if $\theta$ belongs to a subset of $R^1$. We have the following
result.

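Theorem 3: If $F_\theta(\eta)$, $-\infty < \eta < \infty$, is a natural exponential
family dominated by $\mu$, a translation invariant measure, and if
$f(z \mid \eta) = f(z - \eta)$, i.e., a translation family, then (7) with
$g(\theta) = \theta$ is a complete class for estimating $\theta$ under squared
error loss.

Proof. By assumptions

$$\frac{\theta^2 f_\theta(z+\varepsilon)}{f_\theta(z)} = \frac{\theta^2 \int f(z + \varepsilon - \eta)\, e^{\eta\theta}\, d\mu(\eta)}{\int f(z - \eta)\, e^{\eta\theta}\, d\mu(\eta)} = \theta^2 e^{\theta\varepsilon}$$

from which the conditions in (14) are immediately satisfied. □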
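The step behind the proof of Theorem 3 is a change of variable in the numerator. The lines below sketch it, writing $\eta' = \eta - \varepsilon$ (a substitution introduced here) and using the translation invariance of $\mu$; the factor $e^{-\chi(\theta)}$ of the exponential family cancels in the ratio.

```latex
\begin{align*}
\int f(z + \varepsilon - \eta)\, e^{\eta\theta}\, d\mu(\eta)
  &= \int f(z - \eta')\, e^{(\eta' + \varepsilon)\theta}\, d\mu(\eta')
   = e^{\theta\varepsilon} \int f(z - \eta')\, e^{\eta'\theta}\, d\mu(\eta'), \\
\sup_{\theta < 0} \theta^2 e^{\theta\varepsilon} &= \frac{4}{\varepsilon^2 e^{2}} < \infty
  \quad (\text{attained at } \theta = -2/\varepsilon), \qquad
\lim_{\theta \to \infty} \theta^2 e^{-\theta\varepsilon} = 0 .
\end{align*}
```

Replacing $\varepsilon$ by $-\varepsilon$ handles the ratios involving $f_\theta(z - \varepsilon)$ in the same way.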

4. Parametric Empirical Bayes Estimation

In the parametric empirical Bayes approach, $\gamma$ in (7) is assumed unknown
and is estimated from the data using the marginal distribution
of $Z$. If $\hat\gamma$ estimates $\gamma$, the resultant empirical Bayes estimator
is $\delta_{\hat\gamma}(z)$. The maximum likelihood estimator (MLE) is a frequently
employed choice of $\hat\gamma$. In our setting this requires a very
unappealing numerical maximization of a double integral. This
would typically be accomplished by algorithms of a Newton or
quasi-Newton type. Such algorithms are not guaranteed to increase
the likelihood at successive iterations. The EM algorithm as
described in Dempster et al. (1977) offers an attractive alternative.
To fix notation, let $\tau_\gamma$ be dominated by $\omega$ with RN
derivative $t_\gamma(\theta)$. At stage $k$ the algorithm calculates an
expectation $Q(\gamma, \gamma_k) = E_{\gamma_k}\big(\log f_\gamma(Z, \theta) \mid z\big)$ and then maximizes
$Q(\gamma, \gamma_k)$ over all $\gamma$. In our case, this simplifies to

$$\max_\gamma \int f_{\gamma_k}(\theta \mid z)\, \log t_\gamma(\theta)\, d\omega(\theta).$$

This new $\gamma$ is denoted by $\gamma_{k+1}$ and the algorithm is repeated
until stability is achieved. Such a procedure by its
definition may be shown to increase the likelihood with
successive iteration (see Dempster et al., 1977). Wu (1983)
shows that under minimal assumptions such a procedure yields
a stationary value for $f_\gamma(z)$. He recommends several EM
iterations be tried with different starting $\gamma$ representative
of the parameter space to try to identify local and hopefully
a global maximum. Redner and Walker (1984) extensively discuss
the use of the EM algorithm for maximum likelihood estimation
in mixture distributions. Their focus, however, is on the MLE
for $\theta$ in $f_\theta(z)$ as in (1), with $f_\theta(z)$ being a finite mixture density and
$\theta$ a vector including parameters of the distributions being mixed.

If $t_\gamma$ is an exponential family, i.e., $t_\gamma(\theta) = c(\gamma)\, e^{\gamma q(\theta)}$, the algorithm
simplifies to maximizing $\log c(\gamma) + \gamma q_k$, where $q_k = E_{\gamma_k}(q(\theta) \mid z)$. Then $\gamma_{k+1}$
is a solution to $-c'(\gamma)\, c^{-1}(\gamma) = q_k$. In fact, $q_{k+1} = E_{\gamma_{k+1}}(q(\theta) \mid z)$, which
reduces the algorithm to a stationary or fixed point problem. The conditional
representation of $E_\gamma(q(\theta) \mid z)$ as $E_\gamma(b_\gamma(\eta) \mid z)$ noted in (8) is useful here in
reducing the computation needed for the repeated calculation
of the expectation required by the algorithm.

As an example we return to the noncentral chi-squared case
under the assumption leading to (13) to obtain the empirical Bayes
estimate $\delta_{\hat\gamma,\nu}(Z)$. It is clear that direct calculation of the MLE, $\hat\gamma$, is
difficult. However, since $q(\theta) = -\theta$, expression (13), up to a sign change,
sets $q_k = \hat\theta_k$ for a given $\gamma_k$. But since $E_\gamma(\theta) = \gamma^{-1}(p/2 + \nu)$, we have

$$\gamma_{k+1} = \gamma(\hat\theta_k) = \hat\theta_k^{-1}(p/2 + \nu).$$
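The update for $\gamma_{k+1}$ can be read off from the general exponential family M-step. The worked lines below are a sketch under the assumption that the Gamma$(p/2+\nu, \gamma)$ density is written as $c(\gamma)e^{\gamma q(\theta)}$ with respect to a dominating measure that absorbs the factor $\theta^{p/2+\nu-1}/\Gamma(p/2+\nu)$; the shorthand $a = p/2 + \nu$ is introduced here.

```latex
\begin{align*}
t_\gamma(\theta) &= \gamma^{a} e^{-\gamma\theta}, \qquad
  c(\gamma) = \gamma^{a}, \quad q(\theta) = -\theta, \quad a = p/2 + \nu, \\
\frac{d}{d\gamma}\big[\log c(\gamma) + \gamma q_k\big] &= \frac{a}{\gamma} + q_k = 0
  \;\Longrightarrow\;
  \gamma_{k+1} = -\frac{a}{q_k} = \frac{p/2 + \nu}{\hat\theta_k},
  \qquad \hat\theta_k = -q_k = E_{\gamma_k}(\theta \mid z).
\end{align*}
```

The second derivative $-a/\gamma^2$ is negative, so the stationary point is the maximizer.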

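Writing this explicitly as a fixed point problem, we have

$$\hat\theta = \frac{\hat\theta}{\hat\theta + p/2 + \nu}\left[\, p/2 + \nu + \frac{\sum_\eta \eta \left(\dfrac{z\hat\theta}{2(\hat\theta + p/2 + \nu)}\right)^{\eta} \xi_\nu(\eta)}{\sum_\eta \left(\dfrac{z\hat\theta}{2(\hat\theta + p/2 + \nu)}\right)^{\eta} \xi_\nu(\eta)} \right] \tag{15}$$

where $\xi_\nu(\eta) = \Gamma(p/2 + \eta + \nu)\,[\Gamma(p/2 + \eta)\cdot \eta!]^{-1}$. Let the right-hand
side of (15) be denoted by $W_\nu(\theta; z)$. We may show that

(i) $W_\nu(0; z) = 0$, i.e., for any $z$, 0 is a fixed point;

(ii) $W_\nu(\theta; z)$ increases in $\theta$ for fixed $z$;

(iii) $W_\nu(\theta; z)$ is bounded for fixed $z$;

(iv) For $\theta$ small, $W_\nu(\theta; z) \approx \theta$ regardless of $z$;

(v) $W_\nu(\theta; z)$ has at most one positive fixed point for a fixed $z$.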
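Properties (i), (ii) and (iv) can be checked numerically from the series in (15). The following is a minimal sketch: the function name `w_nu`, the truncation at 400 terms, and the particular values of $z$, $p$, $\nu$ are illustrative assumptions, and the series is summed on the log scale to avoid overflow.

```python
import math

def w_nu(theta, z, p, nu, terms=400):
    """Right-hand side of the fixed-point equation (15), W_nu(theta; z)."""
    a = p / 2.0 + nu                       # Gamma shape parameter p/2 + nu
    if theta <= 0.0:
        return 0.0                         # property (i): 0 is a fixed point
    shrink = theta / (theta + a)           # equals 1/(gamma + 1) with gamma = a/theta
    x = 0.5 * z * shrink
    if x <= 0.0:
        return shrink * a
    # log of xi_nu(eta) * x**eta for eta = 0, 1, ..., terms - 1
    logs = [
        math.lgamma(a + n) - math.lgamma(p / 2.0 + n) - math.lgamma(n + 1)
        + n * math.log(x)
        for n in range(terms)
    ]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    mean_eta = sum(n * w for n, w in enumerate(weights)) / sum(weights)
    return shrink * (a + mean_eta)

z, p, nu = 10.0, 6, 2
grid = [0.1 * i for i in range(1, 201)]                 # theta in (0, 20]
values = [w_nu(t, z, p, nu) for t in grid]
is_increasing = all(u < v for u, v in zip(values, values[1:]))   # property (ii)
near_identity = w_nu(1e-6, z, p, nu) / 1e-6             # property (iv): ratio near 1
```

Plotting `values` against `grid` for a few choices of `z` gives the rough picture of $W_\nu(\theta; z)$ that the discussion of starting values relies on.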

Hence, given $\nu$ and $z$, a plot of $W_\nu(\theta; z)$ vs. $\theta$ assumes one
of the two forms in Figure 1.

[Figure 1, panels (a) and (b): $W_\nu(\theta; z)$ plotted against $\theta$.]
With increasing $z$ the plot will change from Figure 1(a) to
Figure 1(b). As a result, if we employ the EM algorithm, we
will find that for any starting $\theta$ under Figure 1(a) and for
starting $\theta$ sufficiently small under Figure 1(b), $\hat\theta_k \to 0$, i.e., $\gamma_k \to \infty$.
Moreover, because of (iv), the convergence will be extremely
slow (e.g., 3,000 iterations may yield $\hat\theta_k$ of the order of $10^{-2}$).
It is noteworthy that when $\hat\theta_k \to 0$, i.e., $\hat\theta = 0$ ($\hat\gamma = \infty$), we, in
fact, minimize $f_\gamma(z)$, i.e., the EM algorithm will fail to
maximize the likelihood. But in terms of empirical Bayes
estimation, from (13), $\delta_{\hat\gamma,\nu}(z) = 0$ is a reasonable guess for $\theta$
if $z$ is sufficiently small. If there is a fixed point $\hat\theta > 0$,
the corresponding $\hat\gamma$ must be the MLE. In implementing the EM
algorithm, we should begin with $\gamma$ small, i.e., $\theta$ large, to ensure
finding this fixed point if it exists. Moreover, if after, say,
300 iterations $\hat\theta_{300}$ is small (say $10^{-2}$) and decreasing, we will
conclude that $\hat\theta_k \to 0$. (It is possible that the nonzero fixed
point lies below $\hat\theta_{300}$, but practically this is of little concern.)
The maddeningly slow convergence of the algorithm even to the
unique MLE may make the following alternative attractive. Using a
rough plot of $W_\nu(\theta; z)$ versus $\theta$ for a few choices of $z$ should
enable, when $z$ is sufficiently large, identification of an
appropriate initial $\theta$ to ensure convergence of $\gamma_k$ to the MLE,
or to conclude that $z$ is sufficiently small so that $\hat\theta_k \to 0$.

We note that for any $z$ and any starting $\gamma$ the empirical Bayes
estimator resulting from this algorithm will be $\ge 0$ and, in fact,
if $\nu = 0$, the right-hand side of (13) simplifies and
$\delta_{\hat\gamma,0}(Z) = (Z-p)^+/2$ (in fact, $\hat\gamma = [p(Z-p)^{-1}]^+$), i.e., the
positive part version is empirical Bayes. Perlman and Rasmussen
(1975) observed that the MVUE is itself empirical Bayes if we take
$\delta_{\gamma,0}(Z)$ with $\gamma = p(Z-p)^{-1}$.
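The two regimes of the EM iteration are easy to reproduce in the $\nu = 0$ case, where the series in (15) collapses. The sketch below assumes the simplification $W_0(\theta; z) = u\,(p/2 + zu/2)$ with $u = \theta/(\theta + p/2)$, which is derived here from the $\nu = 0$ closed form; the function names, the stopping rule on successive $\theta$ values, and the starting points are hypothetical choices and not the paper's settings.

```python
def w0(theta, z, p):
    """W_0(theta; z): the nu = 0 case of (15), where E(eta | z) reduces to z*u/2."""
    a = p / 2.0
    u = theta / (theta + a)               # equals 1/(gamma + 1) with gamma = a/theta
    return u * (a + 0.5 * z * u)

def em_fixed_point(z, p, theta0, max_iter=300, tol=1e-10):
    """Iterate theta_{k+1} = W_0(theta_k; z); return (theta, iterations used)."""
    theta = theta0
    for k in range(1, max_iter + 1):
        new = w0(theta, z, p)
        if abs(new - theta) < tol:        # illustrative stopping rule on theta
            return new, k
        theta = new
    return theta, max_iter

# z > p: the iteration settles at the positive fixed point (z - p)/2.
theta_large_z, iters_large_z = em_fixed_point(20.0, 6, theta0=10.0)

# z < p: the only fixed point is 0 and the approach is slow,
# so all 300 iterations are used and theta is still about 10^-2.
theta_small_z, iters_small_z = em_fixed_point(2.0, 6, theta0=5.0)
```

For $z = 20$, $p = 6$ the limit is $(z-p)/2 = 7$, while for $z = 2$ the iterate is still of order $10^{-2}$ after 300 steps, matching the slow drift toward zero described above.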

An extensive simulation was conducted to study the risk
behavior of the $\delta_{\hat\gamma,\nu}(z)$. Cases $p = 6$ with $\nu = -2, -1, 0, 1, 2, 5, 10$
and $p = 12$ with $\nu = -4, -2, 0, 2, 4, 10, 20$ were examined using 5,000
replications. The algorithm was allowed 300 iterations on each
replication. Convergence was declared if $|\gamma_{k+1} - \gamma_k| < 10^{-4}$.
If for any $k$, $\hat\theta_k$ fell below a small threshold (a power of 10; the exponent is illegible), $\hat\theta = 0$ was taken as the estimate. If
the algorithm failed to converge after 300 iterations, $\hat\theta_{300}$ was
taken as the estimate. Starting points of (i) $\gamma_0 = 1$ and (ii)
$\gamma_0 = (Z-p)^{-1}(p + 2\nu)$ if $Z > p$, $\gamma_0 = 1$ if $Z \le p$, were tried. The
choice $\gamma_0 = 1$ may be viewed as the "center" of the parameter
space in that it corresponds, for the induced negative binomial
prior on $\eta$, to a success probability of .5. The choice
$\gamma_0 = (Z-p)^{-1}(p + 2\nu)$ arises from the fact that $E_\gamma(Z) = p + \gamma^{-1}(p + 2\nu)$.
Both starting values were successful in obtaining the unique
MLE. However, (ii) tended to converge more quickly. Convergence
tended to be slower with increasing $\nu$, although more frequently
to the unique MLE than to the fixed point at 0, i.e., more
frequently we would be in the case of Figure 1(b). Increasing $p$
from 6 to 12 led to quicker convergence, again more frequently
to the MLE.


[Figures 2 and 3: RMSE plotted against $\theta$.]
Figures 2 and 3 display the results for $p = 6$ and $p = 12$,
respectively, in terms of relative mean square error (RMSE),
i.e., the MSE of $\delta_{\hat\gamma,\nu}$ relative to $p/2 + 2\theta$, that of the MVUE.
In Figure 2, we present 6 values of $\nu$ surrounding $\nu = 0$, which
we recall yields $(Z-p)^+/2$. In Figure 3, we simplify to
$\nu = -2, 0, 2$. It is noteworthy that the $\nu = 1, 2$ estimates, not previously
discussed in the literature, dominate the positive part MVUE except for $\theta$
small.
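A small Monte Carlo sketch shows how an RMSE of this kind can be estimated. It is not a reproduction of the paper's study: it covers only the $\nu = 0$ rule $(Z-p)^+/2$, draws $Z$ through the Poisson mixture of central chi-squares, and uses a replication count, seed and function names chosen here for illustration.

```python
import math
import random

def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_noncentral_chisq(rng, p, theta):
    """Draw eta ~ Poisson(theta), then Z ~ chi-square with p + 2*eta df."""
    eta = sample_poisson(rng, theta)
    return rng.gammavariate((p + 2 * eta) / 2.0, 2.0)   # shape, scale

def simulate_rmse(p, theta, reps=20000, seed=0):
    """Return (MSE of the MVUE, RMSE of the positive-part rule)."""
    rng = random.Random(seed)
    se_mvue = se_pos = 0.0
    for _ in range(reps):
        z = sample_noncentral_chisq(rng, p, theta)
        mvue = (z - p) / 2.0
        se_mvue += (mvue - theta) ** 2
        se_pos += (max(mvue, 0.0) - theta) ** 2
    reference = p / 2.0 + 2.0 * theta      # theoretical MSE of the MVUE
    return se_mvue / reps, (se_pos / reps) / reference

mse_mvue, rmse_pos = simulate_rmse(p=6, theta=2.0)
```

The simulated MSE of the MVUE should sit close to $p/2 + 2\theta = 7$ here, and the RMSE of the positive part rule cannot exceed the simulated MSE ratio of the MVUE because truncation at zero never increases the squared error when $\theta \ge 0$.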